Abstract

We consider two-person sports where each rally is initiated by a
server, the other player (the receiver) becoming the server when he/she 
wins a rally. Historically, these sports used a scoring based on the 
side-out scoring system, in which points are only scored by the server. 
Recently, however, some federations have switched to the rally-point 
scoring system in which a point is scored on every rally. As various 
authors before us, we study how much this change affects the game. 
Our approach is based on a rally-level analysis of the process through 
which, besides the well-known probability distribution of the scores, we 
also obtain the distribution of the number of rallies. This yields a comprehensive
knowledge of the process at hand, and allows for an in-depth
comparison of both scoring systems. In particular, our results help to 
explain why the transition from one scoring system to the other has 
more important implications than those predicted from game-winning 
probabilities alone. Some of our findings are quite surprising, and 
unattainable through Monte Carlo experiments. Our results are of 
high practical relevance to international federations and local tournament
organizers alike, and also open the way to efficient estimation of
the rally-winning probabilities, which should have a significant impact 
on the quality of ranking procedures. 

Keywords. Combinatorial derivations; Duration analysis; Point estimation; Ranking
procedures; Scoring rules; Two-person sports

1 Introduction. 

We consider a class of two-person sports for which each rally is initiated by a 
server — the other player is then called the receiver — and for which the rules and 
scoring system satisfy one of the following two definitions. 

*E.C.A.R.E.S. and Departement de Mathematique, av. F.D. Roosevelt 50 - CP114, 
B-1050 Brussels 

^Departement de Mathematique, Campus Plaine - CP 210, B-1050 Brussels 



Side-out scoring system: (i) The server in the first rally is determined
by flipping a coin. (ii) If a rally is won by the server, the latter scores
a point and serves in the next rally. Otherwise, the receiver becomes
the server in the next rally, but no point is scored. (iii) The winner of
the game is the first player to score $n$ points.

Rally-point scoring system: (i) The server in the first rally is determined
by flipping a coin. (ii) If a rally is won by the server, the latter serves
in the next rally. Otherwise, the receiver becomes the server in the
next rally. A point is scored after each rally. (iii) The winner of the
game is the first player to score $n$ points.

A match would typically consist of a sequence of such games, and the winner 
of the match is the first player to win $M$ games. In practice, the first server
in game $m \geq 2$ is usually not determined by flipping a coin, but rather
according to some prespecified rule: the most common one states that the first
server in game $m$ is the winner of game $m - 1$, but alternatively, the players might
simply take turns as the first server in each game until the match is over. It turns
out that, in the probabilistic model we consider below, the probability that a fixed
player wins the match is the same under both rules; see Anderson (1977). This
clearly allows us to focus on a single game in the sequel, as in most previous works
in the field (references will be given below). Extensions of our results to the match
level can then trivially be obtained by appropriate conditioning arguments, taking
into account the very rule adopted for determining the first server in each game.

The side-out scoring system has been used in various sports, sometimes up to
minor refinements, typically involving, in case of a tie at $n - 1$, the
possibility (for the receiver) to choose whether the game should be played to $n + \ell$
(for some fixed $\ell \geq 2$) or to $n$; see Section 2. When based on the so-called English
scoring system, squash currently uses $(n, M) = (9, 3)$. Racquetball is essentially
characterized by $(n, M) = (15, 2)$ (the possible third game is actually played to
11 only). Until 2006, badminton was using $(n, M) = (15, 2)$ and $(n, M) = (11, 2)$
for men's and women's singles, respectively, with an exception in 2002, where
$(n, M) = (7, 3)$ was tried. Volleyball, for which the term persons above
should of course be understood as teams, was based on $(n, M) = (15, 3)$ until 2000.
In both badminton and volleyball, this scoring system was then replaced with the
rally-point system. Similarly, squash, at the international level, is now based on
the American version of its scoring system, which is nothing but the rally-point
system, in this case with $(n, M) = (11, 3)$. Investigating the deep implications of
this transition from the original side-out scoring system to the rally-point scoring
system was one of the main motivations for this work; see Section 4.

Irrespective of the scoring system adopted, the most common probabilistic 
model for the sequence of rallies assumes that the rally outcomes are mutually 
independent and are, conditionally on the server, identically distributed. This implies
that the game is governed by the parameter $(p_a, p_b) \in [0,1] \times [0,1]$, where
the rally-winning probability $p_a$ (resp., $p_b$) is the probability that Player A (resp.,
Player B) wins a rally when serving. This means that players do not get tired
through the match, or that they do not get nervous when playing crucial points.
More precisely, they might get tired or nervous, but if they do, they should do so at
the same moment and to the same extent, so that it does not affect their respective
rally-winning probability. We will throughout refer to this probabilistic model as
the server model, in contrast with the no-server model in which any rally is won
by A with probability $p$ irrespective of the server, that is, the submodel obtained
when taking $p = p_a = 1 - p_b$.
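To make the server model and the two scoring rules concrete, the following Python sketch simulates a single game. It is only an illustration under the assumptions stated above (independent rallies, rally-winning probabilities depending on the server only); the function name `play_game` and its arguments are ours and do not appear in the literature cited here.

```python
import random

def play_game(pa, pb, n, system="side-out", first_server="A", rng=random):
    """Simulate one game in the server model; return (score_A, score_B, rallies).

    pa, pb : probabilities that A, resp. B, wins a rally when serving.
    system : "side-out" (only the server can score) or "rally-point".
    Assumes pa and pb are not both 0 under side-out scoring (otherwise the
    game never ends).
    """
    score = {"A": 0, "B": 0}
    serve_win = {"A": pa, "B": pb}
    server, rallies = first_server, 0
    while max(score.values()) < n:
        rallies += 1
        receiver = "B" if server == "A" else "A"
        if rng.random() < serve_win[server]:
            score[server] += 1            # the server wins the rally and scores
        else:
            if system == "rally-point":
                score[receiver] += 1      # the receiver scores under rally-point only
            server = receiver             # in both systems the serve changes hands
    return score["A"], score["B"], rallies
```

Under rally-point scoring the number of rallies always equals the total number of points, whereas under side-out scoring it can be arbitrarily larger; this is the quantity $D$ studied below.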

The probabilistic properties of a single game played under the side-out scoring
system have been investigated in various works. Hsi and Burich (1971) attempted to
derive the probability distribution of game scores (in the sequel, we simply speak of
the score distribution) in terms of $p_a$ and $p_b$, but their derivation based on standard
combinatorial arguments was wrong. The correct score distribution (hence also the
resulting game-winning probabilities) was first obtained in Phillips (1978) by applying
results on sums of random variables having the modified geometric distribution.
Keller (1984) derived probabilities of very extreme scores, whereas Marcus (1985)
derived the complete score distribution in the no-server model. Strauss and Arnold
(1987), by identifying the point earning process as a Markov chain, obtained more
directly the same general result as Phillips (1978). They further used the score
distribution to define maximum likelihood estimators and moment estimators of the
rally-winning probabilities (both in the server and no-server models), and based
on these estimates a ranking system (relying on Bradley-Terry paired comparison
methods) for the players of a league or tournament. Simmons (1989) determined
the score distribution under the two scoring systems, this time by using a quick
and direct combinatorial analysis of a single game. He discussed handicapping and
strategies (for deciding whether the receiver should go for a game played to $n + \ell$
or not in case of a tie at $n - 1$), and attempted a comparison of the two scoring
systems. More recently, Percy (2009) used Monte Carlo simulations to compare
game-winning probabilities and expected durations for both scoring systems in the
no-server model.

To sum up, the score distributions have been obtained through several different
probabilistic methods, and were used to discuss several aspects of the game.
In contrast, the distribution of the number of rallies needed to complete a single
game ($D$, say) remains virtually unexplored for the side-out scoring system (for the
rally-point scoring system, the distribution of $D$ is simply determined by the score
distribution). To the best of our knowledge, the only theoretical result on $D$ under
the side-out scoring system provides lower and upper bounds for the expected value
of $D$; see (20) in Simmons (1989), or (2) below. Beyond the lack of exact results
on $D$ (only approximate theoretical results or simulation-based results are available
so far), it should be noted that only the expected value of $D$ has been studied in
the literature. This is all the more surprising because, in various sports (e.g., in
badminton and volleyball), uncertainty about $D$ (which is related to its variance,
not to its expected value) was one of the most important arguments to switch from
the side-out scoring system to the rally-point scoring system. Exact results on the
moments of $D$ (or, even better, on its distribution) are therefore highly desirable, as they
would make it possible to investigate whether the transition to the rally-point system indeed
reduced uncertainty about $D$. More generally, precise results on the distribution
of $D$ would allow for a much deeper comparison of both scoring systems. They
would also be of high practical relevance, e.g., to tournament organizers, who need
to plan their events and decide in advance the number of matches (hence the
number of players) the events will be able to host.

For the side-out scoring system, however, results on the distribution of $D$ cannot
be obtained from a point-level analysis of the game. That is the reason why
the present work relies instead on a rally-level combinatorial analysis. This makes it
possible to get rid of the uncertainty about the number of rallies needed to score a single
point, and results in an exact computation of the distribution of $D$ and, actually,
even of the number of rallies needed to achieve any fixed score. We derive explicitly
the expectation and variance of $D$, and use our results to compare the two scoring
systems not only in terms of game-winning probabilities, but also in terms of
durations. Our results reveal significant differences between both scoring systems,
and help to explain why the transition from one scoring system to the other has
more important implications than those predicted from game-winning probabilities
alone. As suggested above, they could be used by tournament organizers to plan
their events accurately, but also by national or international federations to better
perform the possible transition from the side-out scoring system to the rally-point
one; see Section 6 for a discussion. Finally, our results open the way to efficient
estimation of the rally-winning probabilities (based on observed scores and durations),
which might have important consequences for the resulting ranking procedures,
since rankings usually are to be based on small numbers of "observations" (here,
games).

The outline of the paper is as follows. In Section 2, we describe our rally-level
analysis of a single game played under the side-out scoring system, and show that
it also leads to the score distribution already derived in Phillips (1978), Strauss and
Arnold (1987), and Simmons (1989). Section 3 explains how this rally-level analysis
further provides (i) the expectation and variance of the number of rallies needed to
achieve a fixed score (Section 3.1) and also (ii) the corresponding exact distribution
(Section 3.2). In Section 4, we then use our results in order to compare the side-out
and rally-point scoring systems, both in terms of game-winning probabilities
(Section 4.1) and durations (Section 4.2). In Section 5, we perform Monte Carlo
simulations and compare the results with our theoretical findings. Section 6 presents
the conclusion and provides some final comments. Finally, an appendix collects
proofs of technical results.



2 Rally-level derivation of the score distribution 
under the side-out scoring system. 

In this section, we conduct our rally-level analysis of a single game played under
the side-out scoring system. We will make the distinction between A-games
and B-games, with the former (resp., the latter) being defined as games in which
Player A (resp., Player B) is the first server. Wherever possible, we will state our
results/definitions in the context of A-games only; in such cases the corresponding
results/definitions for B-games can then be obtained by exchanging the roles
played by A and B, that is, by exchanging (i) $p_a$ and $p_b$ and (ii) the number of
points scored by each player. Whenever not specified, the server $S$ will be considered
random, and we will denote by $s_a := P[S = A]$ and $s_b := P[S = B] = 1 - s_a$ the
probabilities that the game considered is an A-game and a B-game, respectively.
This covers both games where the first server is determined by flipping a coin and
games where the first server is fixed (by letting $s_a \in \{0, 1\}$).

Our rally-level analysis of the game will be based on the concepts of interrup- 
tions and exchanges first introduced in Hsi and Burich (1971). More precisely, we 
adopt 

Definition 2.1 An A-interruption is a sequence of rallies in which B gains the 
right to serve from A, scores at least one point, then (unless the game is over) 
relinquishes the service back to A, who will score at least one point. An exchange is a 
sequence of two rallies in which one player gains the right to serve, but immediately 
loses this right before he/she scores any point (so that the potential of consecutive 
scoring by his/her opponent is not interrupted). 



We point out that A-interruptions are characterized in terms of score changes
only (and in particular may contain one or several exchanges) and that, at any
time, an exchange clearly occurs with probability $q := q_a q_b := (1 - p_a)(1 - p_b)$.

Now, for $C \in \{A, B\}$, denote by $E^{\alpha,\beta,C}(r, j)$ the event associated with a
sequence of rallies that (i) gives rise to $\alpha$ points scored by Player A and $\beta$ points
scored by Player B, (ii) involves exactly $r$ A-interruptions and $j$ exchanges, and (iii)
is such that Player $C$ scores a point in the last rally; the superscript $C$ therefore
indicates who is scoring the last point, and it is assumed here that $\alpha > 0$ (resp.,
$\beta > 0$) if $C = A$ (resp., if $C = B$). We will write

$$p_{C_1}^{\alpha,\beta,C_2}(r, j) := P[E^{\alpha,\beta,C_2}(r, j) \mid S = C_1], \qquad C_1, C_2 \in \{A, B\}.$$

We then have the following result (see the Appendix for the proof).

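Lemma 2.1 Let $\gamma_0 := \min\{\beta, 1\}$, $\gamma_1 := \min\{\alpha, \beta\}$, and $\gamma_2 := \min\{\alpha, \beta - 1\}$.
Then, setting $\binom{-1}{-1} := 1$, we have

$$p_A^{\alpha,\beta,A}(r, j) = \binom{\alpha+\beta+j-1}{j} \binom{\alpha}{r} \binom{\beta-1}{r-1}\, p_a^{\alpha} p_b^{\beta} q^{r+j}, \qquad r \in \{\gamma_0, \ldots, \gamma_1\},\ j \in \mathbb{N},$$

and

$$p_A^{\alpha,\beta,B}(r, j) = \binom{\alpha+\beta+j-1}{j} \binom{\alpha}{r-1} \binom{\beta-1}{r-1}\, p_a^{\alpha} p_b^{\beta} q_a q^{r+j-1}, \qquad r \in \{1, \ldots, \gamma_2 + 1\},\ j \in \mathbb{N}.$$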



By taking into account all possible values for the numbers of A-interruptions and
exchanges, Lemma 2.1 quite easily leads to the following result (see the Appendix
for the proof), which then trivially provides the score distribution in an A-game,
hence also the corresponding game-winning probabilities.
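Theorem 2.1 Let $p_{C_1}^{\alpha,\beta,C_2} := P[E^{\alpha,\beta,C_2} \mid S = C_1]$, where $E^{\alpha,\beta,C_2} := \cup_{r,j} E^{\alpha,\beta,C_2}(r, j)$,
with $C_1, C_2 \in \{A, B\}$. Then

$$p_A^{\alpha,\beta,A} = \frac{p_a^{\alpha} p_b^{\beta}}{(1-q)^{\alpha+\beta}} \sum_{r=\gamma_0}^{\gamma_1} \binom{\alpha}{r} \binom{\beta-1}{r-1} q^{r}
\quad \text{and} \quad
p_A^{\alpha,\beta,B} = \frac{p_a^{\alpha} p_b^{\beta} q_a}{(1-q)^{\alpha+\beta}} \sum_{r=1}^{\gamma_2+1} \binom{\alpha}{r-1} \binom{\beta-1}{r-1} q^{r-1}.$$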

In the sequel, we denote game scores by couples of integers, where the first entry
(resp., second entry) stands for the number of points scored by Player A (resp., by
Player B). With this notation, a $C$-game ends on the score $(n, k)$ (resp., $(k, n)$),
$k \in \{0, 1, \ldots, n - 1\}$, with probability $p_C^{n,k,A}$ (resp., $p_C^{k,n,B}$), hence is won by A
(resp., by B) with the (game-winning) probability

$$p_C^{A} := \sum_{k=0}^{n-1} p_C^{n,k,A}$$

(resp., $p_C^{B} := 1 - p_C^{A}$); throughout, $E^{A} := \cup_{k=0}^{n-1} E^{n,k,A}$ (resp., $E^{B} := \cup_{k=0}^{n-1} E^{k,n,B}$)
denotes the event that the game, irrespective of the initial server, is won by A
(resp., by B). Of course, unconditional on the initial server, we have

$$p^{n,k,A} := P[E^{n,k,A}] = p_A^{n,k,A} s_a + p_B^{n,k,A} s_b, \qquad p^{k,n,B} := P[E^{k,n,B}] = p_A^{k,n,B} s_a + p_B^{k,n,B} s_b,$$

and

$$p^{C} := P[E^{C}] = p_A^{C} s_a + p_B^{C} s_b,$$

for $C \in \{A, B\}$.
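The closed forms of Theorem 2.1 are straightforward to evaluate numerically. The sketch below (function names are ours, and an A-game with $0 \le q < 1$ is assumed) computes the side-out score probabilities and the game-winning probability, and also recomputes the same probabilities from a point-level Markov chain in the spirit of Strauss and Arnold (1987), purely as an independent sanity check.

```python
from math import comb

def binom(n, k):
    """Binomial coefficient with the convention binom(-1, -1) := 1 of Lemma 2.1."""
    if n == -1 and k == -1:
        return 1
    return comb(n, k) if 0 <= k <= n else 0

def score_prob(alpha, beta, last, pa, pb):
    """P[E^{alpha,beta,last} | S = A] under side-out scoring (Theorem 2.1).

    Requires alpha > 0 if last == "A" and beta > 0 if last == "B".
    """
    qa = 1 - pa
    q = qa * (1 - pb)
    base = pa**alpha * pb**beta / (1 - q) ** (alpha + beta)
    if last == "A":
        rs = range(min(beta, 1), min(alpha, beta) + 1)
        return base * sum(binom(alpha, r) * binom(beta - 1, r - 1) * q**r for r in rs)
    rs = range(1, min(alpha, beta - 1) + 2)
    return base * qa * sum(
        binom(alpha, r - 1) * binom(beta - 1, r - 1) * q ** (r - 1) for r in rs
    )

def win_prob_A(n, pa, pb):
    """Probability that A wins an A-game: sum over the final scores (n, k)."""
    return sum(score_prob(n, k, "A", pa, pb) for k in range(n))

def score_prob_markov(alpha, beta, last, pa, pb):
    """Same probability obtained from a point-level Markov chain (sanity check)."""
    qa, qb = 1 - pa, 1 - pb
    q = qa * qb
    # nxt[s][t]: probability that t scores the next point when s is serving
    nxt = {"A": {"A": pa / (1 - q), "B": qa * pb / (1 - q)},
           "B": {"A": qb * pa / (1 - q), "B": pb / (1 - q)}}
    reach = {(0, 0, "A"): 1.0}   # A serves first
    for total in range(1, alpha + beta + 1):
        for a in range(max(0, total - beta), min(alpha, total) + 1):
            b = total - a
            for t in "AB":
                prev = (a - 1, b) if t == "A" else (a, b - 1)
                reach[(a, b, t)] = sum(
                    reach.get(prev + (s,), 0.0) * nxt[s][t] for s in "AB"
                )
    return reach.get((alpha, beta, last), 0.0)
```

For instance, `win_prob_A(15, 0.5, 0.5)` evaluates the probability that the first server wins a game to 15 between two equally strong players.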

Figures 1(a)-(b) present, for $n = 15$, the score distributions associated with $(p_a, p_b) =
(.7, .5)$, $(.6, .5)$, $(.5, .5)$, and $(.4, .5)$. We reversed the $k$-axis in Figure 1(b), since,
among all scores associated with a victory of B, the score (14,15) can be considered
the closest to the score (15,14) (associated with a victory of A). It then makes
sense to regard Figures 1(a)-(b) as a single plot. The resulting "global" probability
curves are quite smooth and, as expected, unimodal (with the exception of the
$p_a = p_b = .5$ curve, which is slightly bimodal). It appears that these score distributions
are extremely sensitive to $(p_a, p_b)$, as are the corresponding game-winning
probabilities ($p_A^A$ ranges from .94 to .22 when, for fixed $p_b = .5$, $p_a$ goes from
.7 to .4). For $p_a = p_b = .5$, we would expect the global probability curve to be
symmetric. The advantage Player A is given by serving first in the game, however,
makes this curve slightly asymmetric; this is quantified by the corresponding
probability that A wins the game, namely $p_A^A = .53 > .47 = p_A^B$.

As mentioned in the Introduction, sports based on the side-out scoring system
may involve tie-breaks in case of a tie at $n - 1$. This means that, at this tie, the
receiver has the option of playing through to $n$ or "setting to $\ell$" (for a fixed $\ell \geq 2$),
in which case the winner is the first player to score $\ell$ further points. For instance,
games in the current side-out scoring system for squash are played to $n = 9$ points,
and the receiver, at $(8, 8)$, may decide whether the game is to 9 or 10 points ($\ell = 2$).
Before the transition to the rally-point system in 2006, similar tie-breaks were used
in badminton, there with $n = 15$ and $\ell =$ [value lost in the source]. Assuming that the game is always set
to $\ell$ in case of a tie at $n - 1$, the resulting score distribution can then be easily derived
from Theorem 2.1 by appropriate conditioning; for instance, the score $(n + \ell - 1, n +
k - 1)$, $k \in \{0, 1, \ldots, \ell - 1\}$, occurs in an A-game with probability $p_A^{n-1,n-1,A}\, p_A^{\ell,k,A} +
p_A^{n-1,n-1,B}\, p_B^{\ell,k,A}$. We stress that all results we derive in the later sections can also be
extended to scoring systems involving tie-breaks, again by appropriate conditioning.
Finally, various papers discuss tie-break strategies (whether to play through or
to set the game to $\ell$) on the basis of $p_a$ and $p_b$; see, e.g., Renick (1976, 1977),
Simmons (1989), or Percy (2009).



3 Distribution of the number of rallies under the 
side-out scoring system. 

As mentioned in the Introduction, the literature contains few results about the
number of rallies $D$ to complete a single game played under the side-out scoring
system. Of course, the distribution of $D$ can always be investigated by simulations;
see, e.g., Percy (2009), where Monte Carlo methods are used to estimate the
expectation of $D$ for a broad range of rally-winning probabilities in the no-server
model. To the best of our knowledge, the only available theoretical result is due
to Simmons (1989), and provides lower and upper bounds on the expectation of $D$
in an A-game conditional on a victory of A on the score $(n, k)$. More specifically,
letting

$$e_{C_1}^{\alpha,\beta,C_2} := E[D \mid E^{\alpha,\beta,C_2}, S = C_1], \qquad C_1, C_2 \in \{A, B\}, \qquad (1)$$

Simmons' result states that

$$(n + k)\frac{1+q}{1-q} \leq e_A^{n,k,A} \leq (n + k)\frac{1+q}{1-q} + 2k, \qquad k = 0, 1, \ldots, n - 1. \qquad (2)$$

Unless a shutout is considered (that is, $k = 0$), this is only an approximate result,
whose accuracy quickly decreases with $k$. Again, the reason why no exact results
are available is that all analyses of the game in the literature are of a point-level
nature. In sharp contrast, our rally-level analysis allows, inter alia, for obtaining
exact values of all moments of $D$, as well as its complete distribution.



3.1 Moments. 

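We first introduce the following notation. Let $R^{\alpha,\beta,A}$ (resp., $R^{\alpha,\beta,B}$) be a random
variable assuming values $r = \gamma_0, \gamma_0 + 1, \ldots, \gamma_1$ (resp., $r = 1, 2, \ldots, \gamma_2 + 1$) with
corresponding probabilities

$$W^{\alpha,\beta,A}(q, r) := \binom{\alpha}{r} \binom{\beta-1}{r-1} q^{r} \Big/ \Big[ \sum_{s=\gamma_0}^{\gamma_1} \binom{\alpha}{s} \binom{\beta-1}{s-1} q^{s} \Big]$$

(resp.,

$$W^{\alpha,\beta,B}(q, r) := \binom{\alpha}{r-1} \binom{\beta-1}{r-1} q^{r-1} \Big/ \Big[ \sum_{s=1}^{\gamma_2+1} \binom{\alpha}{s-1} \binom{\beta-1}{s-1} q^{s-1} \Big]$$

). Conditioning with respect to the number of A-interruptions and exchanges then yields the following
result (see the Appendix for the proof).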






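Theorem 3.1 Let $t \mapsto M_{C_1}^{\alpha,\beta,C_2}(t) := E[e^{tD} \mid E^{\alpha,\beta,C_2}, S = C_1]$, $C_1, C_2 \in \{A, B\}$, be
the moment generating function of $D$ conditional on the event $E^{\alpha,\beta,C_2} \cap [S = C_1]$,
and let $\delta_{C_1,C_2} = 1$ if $C_1 = C_2$ and $0$ otherwise. Then

$$M_A^{\alpha,\beta,C}(t) = \left( \frac{(1-q)e^{t}}{1 - q e^{2t}} \right)^{\alpha+\beta} E\big[ e^{t(2R^{\alpha,\beta,C} - \delta_{B,C})} \big]$$

for $C \in \{A, B\}$.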
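As a reading aid only (the formal proof is in the Appendix), the structure of this moment generating function can be sketched as follows. The symbol $J$, denoting the number of exchanges, is introduced here for illustration and is not part of the original notation. Conditionally on $E^{\alpha,\beta,C} \cap [S = A]$, the joint probabilities of Lemma 2.1 factorise in $(r, j)$, so that $J$ and $R^{\alpha,\beta,C}$ are independent, and counting rallies gives

```latex
\begin{align*}
D &= \alpha + \beta + 2J + 2R^{\alpha,\beta,C} - \delta_{B,C}, \\
P[J = j] &= \binom{\alpha+\beta+j-1}{j}\, q^{j} (1-q)^{\alpha+\beta}, \qquad j \in \mathbb{N}, \\
E\big[e^{2tJ}\big] &= \sum_{j \ge 0} \binom{\alpha+\beta+j-1}{j} (q e^{2t})^{j} (1-q)^{\alpha+\beta}
  = \left( \frac{1-q}{1 - q e^{2t}} \right)^{\alpha+\beta}, \qquad q e^{2t} < 1 .
\end{align*}
```

Multiplying $E[e^{2tJ}]$ by $e^{t(\alpha+\beta)}$ and by $E[e^{t(2R^{\alpha,\beta,C} - \delta_{B,C})}]$ then gives the product form stated in the theorem.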



Quite remarkably, those moment generating functions (hence also all resulting
moments) depend on $(p_a, p_b)$ through $q = (1 - p_a)(1 - p_b)$ only. Taking first and
second derivatives with respect to $t$ in the above expressions and setting $t = 0$ then
directly yields the following closed form expressions for the expected values $e_{C_1}^{\alpha,\beta,C_2}$
from (1) and for the corresponding variances

$$v_{C_1}^{\alpha,\beta,C_2} := \mathrm{Var}[D \mid E^{\alpha,\beta,C_2}, S = C_1], \qquad C_1, C_2 \in \{A, B\}.$$



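Corollary 3.1 For $C \in \{A, B\}$, we have (i) $e_A^{\alpha,\beta,C} = (\alpha + \beta)\frac{1+q}{1-q} - \delta_{B,C} +
2E[R^{\alpha,\beta,C}]$ and (ii) $v_A^{\alpha,\beta,C} = 4(\alpha + \beta)\frac{q}{(1-q)^2} + 4\mathrm{Var}[R^{\alpha,\beta,C}]$. Moreover, (iii) $e_A^{\alpha,\beta,C}$
is strictly monotone increasing in $q$.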
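The expressions of Corollary 3.1 can be evaluated directly from the distribution of the number of A-interruptions. The following sketch does so for an A-game; the function names are ours, and $0 < q < 1$ is assumed so that all weights are well defined.

```python
from math import comb

def binom(n, k):
    """Binomial coefficient with the convention binom(-1, -1) := 1."""
    if n == -1 and k == -1:
        return 1
    return comb(n, k) if 0 <= k <= n else 0

def r_distribution(alpha, beta, last, q):
    """Return {r: W(q, r)}, the conditional law of the number of A-interruptions."""
    if last == "A":
        w = {r: binom(alpha, r) * binom(beta - 1, r - 1) * q**r
             for r in range(min(beta, 1), min(alpha, beta) + 1)}
    else:
        w = {r: binom(alpha, r - 1) * binom(beta - 1, r - 1) * q ** (r - 1)
             for r in range(1, min(alpha, beta - 1) + 2)}
    total = sum(w.values())
    return {r: x / total for r, x in w.items()}

def duration_mean_var(alpha, beta, last, q):
    """Mean and variance of D given the score event, in an A-game (Corollary 3.1)."""
    law = r_distribution(alpha, beta, last, q)
    mean_r = sum(r * w for r, w in law.items())
    var_r = sum(r * r * w for r, w in law.items()) - mean_r**2
    delta = 1 if last == "B" else 0          # delta_{B,C}
    mean = (alpha + beta) * (1 + q) / (1 - q) - delta + 2 * mean_r
    var = 4 * (alpha + beta) * q / (1 - q) ** 2 + 4 * var_r
    return mean, var
```

With $n = 15$ and $q = .15$ (that is, $(p_a, p_b) = (.7, .5)$), calling `duration_mean_var(k, 15, "B", 0.15)` for $k = 0, \ldots, 14$ gives the expected durations of A-games won by B on the score $(k, 15)$, and `duration_mean_var(15, k, "A", 0.15)` can be compared with the bounds in (2).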



Clearly, Corollary 3.1 confirms Simmons' result that the expected number of
rallies in an A-game won by A on the score $(n, k)$ is $e_A^{n,k,A} = n\frac{1+q}{1-q}$ for $k = 0$. More
interestingly, it also shows that the exact value for any $k > 0$ is given by

$$e_A^{n,k,A} = (n + k)\frac{1+q}{1-q} + 2 \sum_{r=1}^{k} r\, W^{n,k,A}(q, r). \qquad (3)$$

Note that this is compatible with Simmons' result in (2) since the second term in the
right-hand side of (3) is a weighted mean of $2r$, $r = 1, \ldots, k$. Similarly, the expected
number of rallies in an A-game won by B on the score $(k, n)$, $k = 0, 1, \ldots, n - 1$, is

$$e_A^{k,n,B} = (n + k)\frac{1+q}{1-q} - 1 + 2 \sum_{r=1}^{k+1} r\, W^{k,n,B}(q, r).$$

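The expectation and variance of $D$, in a $C$-game won by A, are then given by

$$e_C^{A} := E[D \mid E^{A}, S = C] = \frac{1}{p_C^{A}} \sum_{k=0}^{n-1} p_C^{n,k,A}\, e_C^{n,k,A},$$

$$v_C^{A} := \mathrm{Var}[D \mid E^{A}, S = C] = \frac{1}{p_C^{A}} \sum_{k=0}^{n-1} p_C^{n,k,A} \big( v_C^{n,k,A} + (e_C^{n,k,A})^2 \big) - (e_C^{A})^2, \qquad (4)$$

while, in a $C$-game unconditional on the winner, they are given by

$$e_C := E[D \mid S = C] = p_C^{A} e_C^{A} + p_C^{B} e_C^{B},$$

$$v_C := \mathrm{Var}[D \mid S = C] = \big( v_C^{A} + (e_C^{A})^2 \big) p_C^{A} + \big( v_C^{B} + (e_C^{B})^2 \big) p_C^{B} - (e_C)^2. \qquad (5)$$

Finally, unconditional on the server, this yields

$$e^{C} := E[D \mid E^{C}] = e_A^{C} s_a + e_B^{C} s_b, \qquad e := E[D] = e_A s_a + e_B s_b,$$

$$v^{C} := \mathrm{Var}[D \mid E^{C}] = \big( v_A^{C} + (e_A^{C})^2 \big) s_a + \big( v_B^{C} + (e_B^{C})^2 \big) s_b - (e^{C})^2, \qquad
v := \mathrm{Var}[D] = (v_A + e_A^2) s_a + (v_B + e_B^2) s_b - e^2. \qquad (6)$$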
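As an illustration of how (4) and (5) combine the score distribution of Theorem 2.1 with the conditional moments of Corollary 3.1, the sketch below computes, for an A-game, the triple (winning probability, expected duration, variance of the duration) for each winner, together with the moments unconditional on the winner. Function names are ours and $0 < q < 1$ is assumed.

```python
from math import comb

def binom(n, k):
    """Binomial coefficient with the convention binom(-1, -1) := 1."""
    if n == -1 and k == -1:
        return 1
    return comb(n, k) if 0 <= k <= n else 0

def terms(alpha, beta, last, q):
    """Unnormalised weights of the number r of A-interruptions."""
    if last == "A":
        return {r: binom(alpha, r) * binom(beta - 1, r - 1) * q**r
                for r in range(min(beta, 1), min(alpha, beta) + 1)}
    return {r: binom(alpha, r - 1) * binom(beta - 1, r - 1) * q ** (r - 1)
            for r in range(1, min(alpha, beta - 1) + 2)}

def score_stats(alpha, beta, last, pa, pb):
    """(probability, mean of D, variance of D) of a score event in an A-game."""
    qa = 1 - pa
    q = qa * (1 - pb)
    w = terms(alpha, beta, last, q)
    total = sum(w.values())
    prob = pa**alpha * pb**beta / (1 - q) ** (alpha + beta) * total
    if last == "B":
        prob *= qa
    mean_r = sum(r * x for r, x in w.items()) / total
    var_r = sum(r * r * x for r, x in w.items()) / total - mean_r**2
    delta = 1 if last == "B" else 0
    mean = (alpha + beta) * (1 + q) / (1 - q) - delta + 2 * mean_r
    var = 4 * (alpha + beta) * q / (1 - q) ** 2 + 4 * var_r
    return prob, mean, var

def game_stats(n, pa, pb):
    """Return ({winner: (p, e, v)}, (e_A, v_A)) for an A-game, following (4)-(5)."""
    out = {}
    for winner in "AB":
        stats = [score_stats(n, k, "A", pa, pb) if winner == "A"
                 else score_stats(k, n, "B", pa, pb) for k in range(n)]
        p = sum(s[0] for s in stats)
        e = sum(s[0] * s[1] for s in stats) / p
        v = sum(s[0] * (s[2] + s[1] ** 2) for s in stats) / p - e**2
        out[winner] = (p, e, v)
    e_all = sum(p * e for p, e, _ in out.values())
    v_all = sum(p * (v + e**2) for p, e, v in out.values()) - e_all**2
    return out, (e_all, v_all)
```

For example, `game_stats(15, 0.7, 0.5)` returns the quantities discussed in the next paragraph for $(p_a, p_b) = (.7, .5)$; here the probability that B wins is obtained by summing over the scores $(k, n)$ rather than as one minus the probability that A wins, which is equivalent when the game ends almost surely.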
Figures 1(c)-(f) plot, for $n = 15$, $e_A^{n,k,A}$, $e_A^{k,n,B}$, $(v_A^{n,k,A})^{1/2}$, and $(v_A^{k,n,B})^{1/2}$ versus
$k$ for $(p_a, p_b) = (.7, .5)$, $(.6, .5)$, $(.5, .5)$, and $(.4, .5)$, and report the corresponding
numerical values of $e_A^A$, $e_A^B$, $e_A$, $(v_A^A)^{1/2}$, $(v_A^B)^{1/2}$, and $(v_A)^{1/2}$. All expectation and
standard deviation curves appear to be strictly monotone increasing functions of
the number $(n + k)$ of points scored, which was perhaps to be expected. More surprising is
the fact that, if one discards very small values of $k$, these curves are also roughly
linear. Clearly, Simmons' lower and upper bounds (2), which are plotted versus
$k$ in Figure 1(c), only provide poor approximations of the exact expected values,
particularly so for large $k$.

The dependence on $(p_a, p_b)$ may be more interesting than that on $k$. Note that,
for each $k$, $e_A^{n,k,A}$ and $e_A^{k,n,B}$ (hence also $e_A^A$, $e_A^B$, and $e_A$) are decreasing functions
of $p_a$, which confirms Corollary 3.1(iii). Similarly, all quantities related to standard
deviations also seem to be decreasing functions of $p_a$. Now, it is seen that, as a
function of $p_a$, the expectation $e_A^A$ is more spread out than $e_A^B$. Indeed, the former
ranges from 32.95 ($p_a = .7$) to 56.30 ($p_a = .4$), whereas the latter ranges from 41.95
to 51.43. On the contrary, the standard deviation of $D$ is more concentrated in an
A-game won by A (where it ranges from 8.34 ($p_a = .7$) to 10.90 ($p_a = .4$)) than
in an A-game won by B (where it ranges from 7.36 to 11.44). This phenomenon
will appear even more clearly in Figure 3 below, where the same values of $(p_a, p_b)$
are considered. Note that the values of $e_A^A$, $e_A^B$, and $e_A$ are totally in line with the
score distribution and the expected values of $D$ for each score. For instance, the
value $e_A^B = 41.95$ for $p_a = .7$ reflects the fact that when B wins such an A-game,
it is very likely (see Figure 1(b)) that he/she will do so on a score that is quite
tight, resulting in a large expected value for $D$ (whereas, a priori, the values of
$e_A^{k,n,B}$ range from 47.82 to 21.29 when $k$ goes from 14 to 0). The dependence of the
expectation and standard deviation of $D$ on rally-winning probabilities will further
be investigated in Section 4 for the no-server model when comparing the side-out
scoring system with its rally-point counterpart.

Finally, in the case $p_a = p_b = .5$, the fact that A is the first server in the game
again brings some asymmetry in the expected values and standard deviations of $D$;
in particular, this serve advantage alone is responsible for the fact that $48.31 = e_A^A <
e_A^B = 49.17$, and, maybe more mysteriously, that $10.23 = (v_A^A)^{1/2} > (v_A^B)^{1/2} = 9.95$.

3.2 Distribution. 

The moment generating functions given in Theorem 3.1 allow, through a suitable
change of variables, for obtaining the corresponding probability generating functions.
These can in turn be rewritten as power series whose coefficients yield the
distribution of $D$ conditional on the event $E^{\alpha,\beta,C} \cap [S = A]$ (see the Appendix for
the proof).

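Theorem 3.2 Let $z \mapsto G_{C_1}^{\alpha,\beta,C_2}(z) := E[z^{D} \mid E^{\alpha,\beta,C_2}, S = C_1]$, $C_1, C_2 \in \{A, B\}$, be
the probability generating function of $D$ conditional on the event $E^{\alpha,\beta,C_2} \cap [S = C_1]$.
Then, for $C \in \{A, B\}$,

[display equation lost in the source]

where, writing $m_+ := \max(m, 0)$, we let

[display equation lost in the source]

and

[display equation lost in the source]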

This result gives the probability distribution of $D$, conditional on $E^{\alpha,\beta,C} \cap [S =
A]$, for $C \in \{A, B\}$. Note that, as expected, we have $P[D = d \mid E^{\alpha,\beta,A}, S = A] =
0 = P[D = d + 1 \mid E^{\alpha,\beta,B}, S = A]$ for all $d < \alpha + \beta$. Moreover, for all nonnegative
integers $j$, $P[D = \alpha + \beta + 2j + 1 \mid E^{\alpha,\beta,A}, S = A] = 0 = P[D = \alpha + \beta + 2j \mid E^{\alpha,\beta,B}, S =
A]$. In the sequel, we refer to this as the server-effect.



Theorem 3.2 of course allows for investigating the shape of the distribution of $D$
above all scores, and not only, as in Figures 1(c)-(f), its expectation and standard
deviation. This is what is done in Figure 2, which plots, as a function of the score,
quantiles of order $\alpha = .01, .05, .25, .5, .75, .95$, and $.99$ for $(p_a, p_b) = (.6, .5)$ (in this
paragraph, $\alpha$ denotes the quantile order, not a number of points). For
each $\alpha$, two types of quantiles are reported, namely (i) the standard quantile $q_\alpha :=
\inf\{d : P[D \leq d \mid E, S = A] \geq \alpha\}$, where $E$ is the score event considered, and (ii) an interpolated quantile, for which
the interpolation is conducted linearly over the set $(d, d + 2)$ containing the expected
quantile (here, we avoid interpolating over $(d, d + 1)$ because of the above server-effect,
which implies that either $d$ or $d + 1$ does not bear any probability mass).
One of the most prominent features of Figure 2 is the wiggliness of the standard
quantile curves, which is directly associated with the server-effect. It should be
noted that the expectation curves (which are the same as in Figures 1(c)-(d)) stand
slightly above the median curves, which possibly indicates that, above each score,
the conditional distribution of $D$ is somewhat asymmetric to the right. This (light)
asymmetry is confirmed by the other quantile curves.

Now, the probability distribution of $D$ in an A-game, unconditional on the score,
is of course derived trivially from its conditional version obtained above and the
score distribution of Section 2. The general form of this distribution is somewhat
obscure (and will not be explicitly given here), but it yields easily interpretable
expressions for small values of $d$. For instance, one obtains

$$P[D = n \mid S = A] = p_a^{n},$$
$$P[D = n + 1 \mid S = A] = q_a p_b^{n},$$
$$P[D = n + 2 \mid S = A] = n q p_a^{n} + p_a q_a p_b^{n}, \ \ldots$$

Finally, the unconditional distribution of $D$ is simply obtained through $P[D = k] =
P[D = k \mid S = A] s_a + P[D = k \mid S = B] s_b$, $k > 0$, where one computes the distribution
for a B-game by exchanging $p_a$ and $p_b$ in the distribution for an A-game.
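The distribution of $D$ in an A-game can also be evaluated numerically without the closed form of Theorem 3.2, by summing the probabilities of Lemma 2.1 over all pairs $(r, j)$ that lead to a given number of rallies. The sketch below follows this route (it is our own illustration, not the power-series expression of the theorem; function names are hypothetical) and uses the fact that an event with $r$ A-interruptions and $j$ exchanges lasts $\alpha + \beta + 2(r + j) - \delta_{B,C}$ rallies.

```python
from math import comb

def binom(n, k):
    """Binomial coefficient with the convention binom(-1, -1) := 1."""
    if n == -1 and k == -1:
        return 1
    return comb(n, k) if 0 <= k <= n else 0

def terms(alpha, beta, last, q):
    """Factor of Lemma 2.1 depending on r only (binomials times a power of q)."""
    if last == "A":
        return {r: binom(alpha, r) * binom(beta - 1, r - 1) * q**r
                for r in range(min(beta, 1), min(alpha, beta) + 1)}
    return {r: binom(alpha, r - 1) * binom(beta - 1, r - 1) * q ** (r - 1)
            for r in range(1, min(alpha, beta - 1) + 2)}

def joint_prob(alpha, beta, last, pa, pb, d):
    """P[E^{alpha,beta,last}, D = d | S = A], summing Lemma 2.1 over r + j = m."""
    qa = 1 - pa
    q = qa * (1 - pb)
    delta = 1 if last == "B" else 0
    extra = d - (alpha + beta) + delta        # must equal 2 (r + j)
    if extra < 0 or extra % 2:
        return 0.0                            # server-effect: no mass here
    m = extra // 2
    scale = pa**alpha * pb**beta * (qa if last == "B" else 1)
    return scale * sum(
        x * comb(alpha + beta + m - r - 1, m - r) * q ** (m - r)
        for r, x in terms(alpha, beta, last, q).items() if r <= m
    )

def duration_pmf(n, pa, pb, d):
    """P[D = d | S = A] for a game to n points under side-out scoring."""
    return sum(joint_prob(n, k, "A", pa, pb, d) + joint_prob(k, n, "B", pa, pb, d)
               for k in range(n))
```

The three small-$d$ expressions displayed above can be recovered from `duration_pmf` with $d = n$, $n + 1$, and $n + 2$.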
Figure 3 shows that there are a number of remarkable aspects to these distributions.
First note the influence of the above-mentioned server-effect, which causes
the wiggliness visible in most curves there. Also note that the distributions in
Figure 3(c) are much less wiggly than the corresponding curves in Figures 3(a)-(b).
As it turns out, this wiggliness is present, albeit more or less markedly, at all
stages (that is, not only to the right of the mode) for every choice of $(p_a, p_b)$. Most
importantly, despite their irregular aspect, all curves are essentially unimodal, as
expected. Now, consider the dependence on $p_a$ of the position and spread of these
curves. One sees that while their spread clearly increases much more rapidly with $p_a$
in Figure 3(b) than in Figure 3(a), the opposite can be said for their mode. This
is easily understood in view of the corresponding means and variances, which are
recalled in the legend boxes (and coincide with those from Figure 1). As for the
curves in Figure 3(c), they are obtained by averaging the corresponding curves in
Figure 3(a) and Figure 3(b) with weights $p_A^A$ and $p_A^B = 1 - p_A^A$, respectively. Taking
into account the values of these probabilities explains why the curves with $p_a = .7$
and $p_a = .6$ are essentially the corresponding curves in Figure 3(a), whereas that
with $p_a = .4$ is closer to the corresponding curve in Figure 3(b).



4 Comparison with the rally-point scoring system. 

One of the main motivations for this work was to compare more deeply the side-out
scoring system considered in Sections 2 and 3 with the rally-point scoring system.
As mentioned in the Introduction, many sports recently switched (e.g., badminton,
volleyball), or are in the process of switching (e.g., squash), from the side-out
scoring system to its rally-point counterpart, whereas others (e.g., racquetball) so
far are sticking to the side-out scoring system. It is therefore natural to investigate
the implications of the transition to the rally-point system.

The literature, however, has focused on the impact of the scoring system on the
outcome of the game, studied by comparing the game-winning probabilities under
both scoring systems; see, e.g., Simmons (1989). This is all the more surprising
since there has been, in the sport community, much debate and many questions about
how much the duration of the game is affected by the scoring system. Moreover, it
is usually reported that the main motivation for turning to the rally-point system
is to regulate the playing time (that is, to make the length of the match more
predictable), which is of primary importance for television, for instance. Whether
the transition to the rally-point system has indeed served that goal, and, if it has, to
what extent, are questions that have not been considered in the literature, and were
at best addressed on empirical grounds only (by international sport federations).

In this section, we will provide an in-depth comparison of the two scoring systems,
both in terms of game-winning probabilities and in terms of durations, which
will provide theoretical answers to the questions above. Again, this is made possible
by our rally-level analysis of the game and the results of the previous sections
on the distribution of the number of rallies under the side-out scoring system. As
we will discuss in Section 6, our results are potentially of high interest both for
international federations and for local tournament organizers.



4.1 Game-winning probabilities. 

Although the game-winning probabilities for an A-game played under the rally-point
system have already been obtained in the literature (see, e.g., Simmons 1989),
we start by deriving them quickly, mainly for the sake of completeness, but also
because they easily follow from the combinatorial methods used in the previous
sections. First note that there cannot be exchanges (in the sense of Definition 2.1)
in the rally-point scoring system. We then denote by $E^{\alpha,\beta,C}(r)$ ($C \in \{A, B\}$) the
event associated with a sequence of rallies that, in the rally-point system, (i) gives
rise to $\alpha$ points scored by Player A and $\beta$ points scored by Player B, (ii) involves
exactly $r$ A-interruptions, and (iii) is such that Player $C$ scores a point in the last
rally; again, it is assumed here that $\alpha > 0$ (resp., $\beta > 0$) if $C = A$ (resp., if $C = B$).
We write

$$p_{C_1}^{\alpha,\beta,C_2}(r) := P[E^{\alpha,\beta,C_2}(r) \mid S = C_1], \qquad C_1, C_2 \in \{A, B\}.$$



The following result then follows along the same lines as for Lemma [2?T| and Theo- 
rem O 

Theorem 4.1 (i) With the notation above, p'^'^'^ir) = {") Czl)Pa^''Pb^'' {qaqbY , 
r e {70, . . . ,7i}, andff/'''{r) = J O^r'^+Vf "''J(9aib)'^-\ r € {1, ... ,72+ 
1}. (ii) Writing p°^^''~^ for the probability of the event E'^^'^ := ij^ E'^^''~^ [r) , we 

havev'^/^^=v>iY:ri,. -rfpi'"'^ =p>r^'?aE:^ir C-jei) 

(toXhf \ where we let ta = Qa/Pa and tb = qb/pb- 



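Remark 4.1 These expressions further simplify in the no-server model $(p :=)\, p_a =
1 - p_b$. There we indeed have $t_b = t_a^{-1}$, so that the above formulas yield
$\bar p_A^{\alpha,\beta,A} = \binom{\alpha+\beta-1}{\beta}\, p^{\alpha}(1-p)^{\beta}$ and
$\bar p_A^{\alpha,\beta,B} = \binom{\alpha+\beta-1}{\alpha}\, p^{\alpha}(1-p)^{\beta}$.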
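
The counting argument behind Theorem 4.1 and its no-server simplification can be checked numerically. The following is only a minimal sketch, not part of the original derivation: the function names are hypothetical, and the formulas coded are the ones obtained by counting interruptions as in Lemma 2.1 (an A-game is assumed, i.e., Player A serves first).

```python
from math import comb, isclose

def rally_point_score_prob(alpha, beta, pa, pb, last):
    """P[A scores alpha, B scores beta, `last` wins the final rally] in a rally-point A-game."""
    qa, qb = 1.0 - pa, 1.0 - pb
    total = 0.0
    if last == "A" and alpha >= 1:
        # r = number of A-interruptions (maximal runs of points scored by B)
        for r in range(min(beta, 1), min(alpha, beta) + 1):
            ways = comb(alpha, r) * (comb(beta - 1, r - 1) if r > 0 else 1)
            total += ways * pa ** (alpha - r) * pb ** (beta - r) * (qa * qb) ** r
    elif last == "B" and beta >= 1:
        # the sequence must end with an interruption, hence r >= 1
        for r in range(1, min(alpha + 1, beta) + 1):
            ways = comb(alpha, r - 1) * comb(beta - 1, r - 1)
            total += ways * pa ** (alpha - r + 1) * pb ** (beta - r) * qa ** r * qb ** (r - 1)
    return total

def no_server_score_prob(alpha, beta, p, last):
    """Closed form in the no-server model (pa = p, pb = 1 - p)."""
    lower = beta if last == "A" else alpha
    return comb(alpha + beta - 1, lower) * p ** alpha * (1.0 - p) ** beta

if __name__ == "__main__":
    p = 0.55
    for k in (0, 5, 20):
        general = rally_point_score_prob(21, k, p, 1.0 - p, "A")
        closed = no_server_score_prob(21, k, p, "A")
        print(k, general, closed, isclose(general, closed))
```

In the no-server model the sum over the number of interruptions collapses by Vandermonde's identity, which is why the two functions agree.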

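Of course, the resulting score distribution and game-winning probabilities for
an A-game directly follow from Theorem 4.1. In accordance with the notation
adopted for the side-out scoring system, we will write

$$\bar p_C^{A} := P[\bar E^{A} \mid S = C] := P[\cup_{k=0}^{n-1} \bar E^{n,k,A} \mid S = C] = \sum_{k=0}^{n-1} \bar p_C^{n,k,A}, \qquad \bar p_C^{B} := 1 - \bar p_C^{A},$$

$$\bar p^{n,k,A} := P[\bar E^{n,k,A}] = \bar p_A^{n,k,A} s_a + \bar p_B^{n,k,A} s_b, \qquad \bar p^{k,n,B} := P[\bar E^{k,n,B}] = \bar p_A^{k,n,B} s_a + \bar p_B^{k,n,B} s_b,$$

and

$$\bar p^{C} := P[\bar E^{C}] = \bar p_A^{C} s_a + \bar p_B^{C} s_b.$$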

Figures 4(a)-(b) plot the same score distribution curves as in Figures 1(a)-(b),
respectively, but in the case of an A-game played under the rally-point scoring
system with n = 21. Both pairs of plots look roughly similar, although extreme
scores seem to be less likely in the rally-point scoring; this confirms the findings
of Simmons (1989) according to which shutouts are less frequent under the rally-point
scoring system. Note also that, unlike for the side-out scoring, the $(p_a,p_b) =
(.5, .5)$ curve in Figure 4(a) is the exact reverse image of the corresponding one
in Figure 4(b): for the rally-point scoring, Player A does not get any advantage
from serving first if $(p_a,p_b) = (.5, .5)$, which is confirmed by the game-winning
probabilities $\bar p_A^A = \bar p_A^B = 1/2$.

Again, the dependence of the game-winning probabilities on $(p_a,p_b)$ is of primary
importance. We will investigate this dependence visually and compare it
with the corresponding dependence for the side-out scoring system. To do so, we
focus on the no-server version ($p = p_a = 1 - p_b$) of Badminton, where, as already
mentioned, the side-out scoring system with n = 15 (men's singles) was recently
replaced with the rally-point one characterized by n = 21. The results are reported
in Figures 5(a)-(b). Figure 5(a) supports the claim (reported, e.g., in Simmons
(1989) or Percy (2009)) that, for any fixed p, the scoring system barely influences
game-winning probabilities. Now, while Figure 5(b) shows that the probability that
Player A wins an A-game is essentially the same for both scoring systems if he/she
is the better player ($\bar p_A^A/p_A^A \in (.926, 1)$ for $p > .5$, and
$\bar p_A^A/p_A^A \in (.997, 1)$ for $p > .7$), it
tells another story for $p < .5$: there, the probability that A wins an A-game played
under the rally-point system (i) becomes relatively negligible for very small values
of p (in the sense that $\bar p_A^A/p_A^A \to 0$ as $p \to 0$) and (ii) can be up to 28 times larger
than under the side-out system (for values of p close to .1). Of course, one can
say that (i) is irrelevant since it is associated with an event (namely, a victory of
A) occurring with very small probability; (ii), however, constitutes an important
difference between both scoring systems for values of p that are not so extreme.
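
The comparison of game-winning probabilities can be reproduced with a short computation. The sketch below is an illustration under stated assumptions rather than the method used in the paper: the side-out probability is obtained by a backward recursion over scores (the serve-exchange loop is summed out as a geometric series), the rally-point probability uses the no-server closed form, and all function names are hypothetical.

```python
from math import comb

def side_out_win_prob(n, pa, pb):
    """P[A wins an A-game] under side-out scoring (first player to reach n points wins)."""
    qa, qb = 1.0 - pa, 1.0 - pb
    # 1 - qa*qb: a full exchange (both players lose their serve) has probability qa*qb
    denom = 1.0 - qa * qb
    # w_a[a][b] (resp. w_b[a][b]): P[A wins | score (a, b), A (resp. B) about to serve]
    w_a = [[0.0] * (n + 1) for _ in range(n + 1)]
    w_b = [[0.0] * (n + 1) for _ in range(n + 1)]
    for b in range(n):
        w_a[n][b] = w_b[n][b] = 1.0  # A has already won
    for a in range(n - 1, -1, -1):
        for b in range(n - 1, -1, -1):
            after_a_scores = w_a[a + 1][b]  # A scores and keeps the serve
            after_b_scores = w_b[a][b + 1]  # B scores and keeps the serve
            w_a[a][b] = (pa * after_a_scores + qa * pb * after_b_scores) / denom
            w_b[a][b] = (pb * after_b_scores + qb * pa * after_a_scores) / denom
    return w_a[0][0]

def rally_point_win_prob(n, p):
    """P[A wins] under rally-point scoring in the no-server model."""
    return sum(comb(n - 1 + k, k) * p ** n * (1.0 - p) ** k for k in range(n))

if __name__ == "__main__":
    for p in (0.05, 0.1, 0.3, 0.5, 0.7):
        ratio = rally_point_win_prob(21, p) / side_out_win_prob(15, p, 1.0 - p)
        print(f"p = {p:.2f}: rally-point / side-out = {ratio:.4g}")
```

Evaluating the ratio on a fine grid of values of p is one way to locate the region where both scoring systems differ most.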



4.2 Durations. 



In the rally-point system, the number of rallies needed to achieve the event
$\bar E^{\alpha,\beta,C_2} \cap [S = C_1]$ is not random: with obvious notation, it is
almost surely equal to $\bar e_{C_1}^{\alpha,\beta,C_2} = \alpha + \beta$, which explains
why Figure 4 does not contain the rally-point counterparts of
Figure 1(c)-(f). The various conditional and unconditional means and variances of
the number of rallies in the rally-point system (that is, the quantities
$\bar e_{C_1}^{C_2}$, $\bar v_{C_1}^{C_2}$, $\bar e_{C}$, $\bar v_{C}$, $\bar e^{C}$,
$\bar v^{C}$, $\bar e$, $\bar v$) can then be readily computed from the game-winning
probabilities given in Theorem 4.1 in the exact same way as for the side-out scoring
system. More generally, the corresponding distribution of the number of rallies in
a game trivially follows from the same game-winning probabilities.
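
Since a rally-point game ending on the score (n, k) or (k, n) lasts exactly n + k rallies, the moments of the duration follow directly from the score probabilities. A minimal sketch in the no-server model (the function name is hypothetical):

```python
from math import comb, sqrt

def rally_point_duration_moments(n, p):
    """Mean and standard deviation of the number of rallies D in a rally-point game
    (no-server model: Player A wins each rally with probability p)."""
    q = 1.0 - p
    mean = second = 0.0
    for k in range(n):
        # score (n, k): A wins; score (k, n): B wins; both last n + k rallies
        prob = comb(n + k - 1, k) * (p ** n * q ** k + q ** n * p ** k)
        d = n + k
        mean += d * prob
        second += d * d * prob
    variance = max(second - mean ** 2, 0.0)  # guard against rounding below zero
    return mean, sqrt(variance)

if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(p, rally_point_duration_moments(21, p))
```

The symmetry of the summand in p and 1 - p is what makes the unconditional rally-point curves symmetric about p = .5.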

Figures 5(c)-(h) plot, as functions of $p = p_a = 1 - p_b$ (hence, in the no-server
model), expected values and standard deviations of the numbers of rallies needed
to complete (i) A-games played under the side-out system with n = 15 and (ii) A-games
played under the rally-point system with n = 21. Clearly, those plots allow
for an in-depth original comparison of both scoring systems. Let us first focus on
durations unconditional on the winner of the game. Figure 5(c) shows that (i)
games played under the side-out system will last longer than those played under
the rally-point one for players of roughly the same level (which was expected since
the side-out system will then lead to many exchanges), whereas (ii) the opposite is
true when one player is much stronger (which is explained by the fact that shutouts
require more rallies in the rally-point scoring considered than in the side-out one).
Maybe less expected is the fact (Figure 5(f)) that the standard deviation of D is,
uniformly in $p \in (0, 1)$, smaller for the rally-point scoring system than for the
side-out system, which shows that the transition to the rally-point system indeed makes
the length of the match more predictable. The twin-peak shape of both standard
deviation curves is even more surprising. Finally, note that, while the rally-point
curves in Figures 5(c) and (f) are symmetric about p = .5, the side-out curves are
not, which is due to the server-effect. This materializes into the limits of $e_A$ given
by 16 and 15 as $p \to 0$ and $p \to 1$, respectively (which was expected: if Player B
wins each rally with probability one, he/she will indeed need 16 rallies to win an
A-game, since he/she has to regain the right to serve before scoring his/her first
point), but also translates into (i) the fact that the mode of the side-out curve in
Figure 5(c) is not exactly located at p = .5 and (ii) the slightly different heights of
the two local (side-out) maxima in Figure 5(f).

We then turn to durations conditional on the winner of the game, whose expected
values and standard deviations are reported in Figures 5(d), (e), (g), and
(h). These figures look most interesting and reveal important differences between
both scoring systems. Even the general shape of the curves there is of a different
nature for both scorings; for instance, the rally-point curves in Figures 5(d)-(e) are
monotonic, while the side-out ones are unimodal. Similarly, in Figure 5(g), the
rally-point curve is unimodal, whereas the side-out curve exhibits a most unexpected
bimodal shape. It is also interesting to look at limits as $p \to 0$ or $p \to 1$ in
those four subfigures; these limits, which are derived in Appendix A.3, are plotted
as short horizontal lines. Consider first limits under events occurring with probability
one, that is, limits as $p \to 1$ in Figures 5(d), (g) and limits as $p \to 0$ in Figures 5(e),
(h). The resulting limits are totally in line with the intuition: the four conditional
standard deviations go to zero, which implies that, in the limit, D is conditionally
almost surely equal to the corresponding limiting (conditional)
expectation. These expectations themselves assume very natural values: for instance, for
the same reason as above, $e_A^B$ converges to 16, which is therefore the limit of D in
probability.

Much more surprising is what happens for limits under events occurring with
probability zero, that is, limits as $p \to 0$ in Figures 5(d), (g) and limits as $p \to 1$
in Figures 5(e), (h). Focusing first on the side-out scoring system, it is seen
that a (miraculous) victory of A will require, in the limit, almost surely D = 15
rallies, while the limiting conditional distribution of D for victories of B is
non-degenerate. The latter distribution is shown (see Appendix A.3) to be uniform
over $\{n+1, n+2, \ldots, 2n\}$ (hence is stochastically bounded!), which is compatible
with the values $n + 1 + (n-1)/2$ $(\approx 3n/2)$ and $(n^2-1)/12$ for the limiting expectation
and variance, respectively. It should be noted here that this huge difference between
those two limiting conditional distributions of D is entirely due to the server-effect.
In the absence of the server-effect, the subfigures (e) and (h) should indeed be
the exact reverse image of the subfigures (d) and (g), respectively. Similarly, the
bimodality of the side-out curve in Figure 5(g) is also due to the server-effect. We
then consider the rally-point scoring, which is not affected by the server-effect,
so that it is sufficient to consider the limits as $p \to 0$ in Figures 5(d), (g).
There, one also gets a non-degenerate limiting conditional distribution for D, with
expectation $2n^2/(n+1)$ $(\approx 2n)$ and variance $2n^2(n-1)/[(n+1)^2(n+2)]$ $(\approx 2)$.
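
The limiting moments quoted above can be verified with exact rational arithmetic. The sketch below assumes the two limiting conditional distributions described in Appendix A.3: the uniform distribution over {n+1, ..., 2n} for the side-out system, and weights proportional to the number of equally likely trajectories leading to the score (n, k) for the rally-point system. Function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def moments(dist):
    """Exact mean and variance of a distribution given as {value: probability}."""
    mean = sum(d * w for d, w in dist.items())
    var = sum((d - mean) ** 2 * w for d, w in dist.items())
    return mean, var

def side_out_limit(n):
    """Limiting law of D given a victory of B as p -> 1: uniform over {n+1, ..., 2n}."""
    return {d: Fraction(1, n) for d in range(n + 1, 2 * n + 1)}

def rally_point_limit(n):
    """Limiting law of D given a victory of A as p -> 0: P[D = n + k] is proportional
    to the number C(n + k - 1, k) of trajectories ending on the score (n, k)."""
    weights = {n + k: comb(n + k - 1, k) for k in range(n)}
    total = sum(weights.values())
    return {d: Fraction(w, total) for d, w in weights.items()}

if __name__ == "__main__":
    print(moments(side_out_limit(15)))     # compare with (3n + 1)/2 and (n^2 - 1)/12
    print(moments(rally_point_limit(21)))  # compare with 2n^2/(n + 1) and the variance formula
```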



5 Simulations. 

We performed several Monte Carlo simulations, one for each figure considered so
far (except Figure 2, as it already contains many theoretical curves). To describe
the general procedure, we focus on the Monte Carlo experiment associated with
the side-out scoring system in Figure 5 (results for the rally-point scoring system
there or for the other figures are obtained similarly). For each of the 1,999 values
of p considered in Figure 5, the corresponding values of $p_A^C(p)$, $e_A(p)$, $v_A(p)$,
$e_A^C(p)$, $v_A^C(p)$, $C \in \{A, B\}$, were estimated on the basis of J = 200 independent
replications of an A-game played under the side-out scoring system with
$p_a = 1 - p_b = p$. Of course, for each fixed p, the game-winning probability $p_A^C(p)$
is simply estimated by the proportion of games won by C in the J corresponding A-games:

$$\hat p_A^C(p) := \frac{1}{J} \sum_{j=1}^{J} I_j^C,$$

where $I_j^C$, $j = 1, \ldots, J$, is equal to one (resp., zero) if Player C won (resp., lost) the
jth game. The corresponding estimates for $e_A(p)$, $v_A(p)$, $e_A^C(p)$, and $v_A^C(p)$ then
are given by

$$\hat e_A(p) := \frac{1}{J} \sum_{j=1}^{J} d_j, \qquad \hat v_A(p) := \frac{1}{J} \sum_{j=1}^{J} (d_j - \hat e_A(p))^2,$$

$$\hat e_A^C(p) := \frac{\sum_{j=1}^{J} d_j I_j^C}{\sum_{j=1}^{J} I_j^C}, \qquad \hat v_A^C(p) := \frac{\sum_{j=1}^{J} (d_j - \hat e_A^C(p))^2 I_j^C}{\sum_{j=1}^{J} I_j^C}, \qquad (7)$$

where $d_j$, $j = 1, \ldots, J$, is the total number of rallies in the jth game. These
estimates are plotted in thin blue lines in Figure 5. Clearly, these simulations
validate our theoretical results in Figures 5(a), (c), and (f). To describe what
happens in the other plots, consider, e.g., Figure 5(g). There, it appears that
the theoretical results are confirmed for large values of p only. However, this is
simply explained by the fact that for small values of p, the denominator of $\hat v_A^A(p)$
(see (7)) is very small. Actually, among the 542 x 200 A-games associated with
the 542 values of p < .2710, not a single one here led to a victory of A, so that
the corresponding estimates $\hat v_A^A(p)$ are not even defined. Of course, values of p
slightly larger than .2710 still give rise to a small number of victories of A, so that
the corresponding estimates $\hat v_A^A(p)$ are highly unreliable. The situation of course
improves substantially as p increases, as can be seen in Figure 5(g). Figures 5(b),
(d), (e), and (h) can be interpreted in exactly the same way.
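
The Monte Carlo procedure just described can be sketched as follows. This is only an illustration under stated assumptions: the no-server model is used, the function names and the seed handling are hypothetical, and variances are normalized by the number of games in the relevant subsample, which is an assumption about the exact form of the estimators.

```python
import random
from statistics import fmean, pvariance

def play_side_out_game(n, p, rng):
    """Simulate one side-out A-game (no-server model); return (winner, number of rallies)."""
    score = {"A": 0, "B": 0}
    server, rallies = "A", 0
    while score["A"] < n and score["B"] < n:
        rallies += 1
        rally_winner = "A" if rng.random() < p else "B"
        if rally_winner == server:
            score[server] += 1      # only the server can score
        else:
            server = rally_winner   # side-out: the receiver gains the serve
    return ("A" if score["A"] == n else "B"), rallies

def estimate(n, p, J, seed=0):
    """Estimate winning probability and duration moments from J simulated A-games."""
    rng = random.Random(seed)
    games = [play_side_out_game(n, p, rng) for _ in range(J)]
    durations = [d for _, d in games]
    out = {"p_A": sum(w == "A" for w, _ in games) / J,
           "e": fmean(durations), "v": pvariance(durations)}
    for c in "AB":
        d_c = [d for w, d in games if w == c]
        # conditional estimates are undefined when Player c wins none of the J games
        out["e_" + c] = fmean(d_c) if d_c else None
        out["v_" + c] = pvariance(d_c) if d_c else None
    return out

if __name__ == "__main__":
    print(estimate(15, 0.5, 200))
    print(estimate(15, 0.05, 200))  # victories of A are essentially never observed
```

For small p the conditional estimates for Player A come back as `None`, which mirrors the undefined estimates discussed in the text.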

This underlines the fact that expectations and variances conditional on events
with small probabilities are of course extremely difficult, if not impossible, to
estimate. To quantify this, let us focus again on Figure 5(g), and consider the
local maximum on the left of the plot, which is (on the grid of values of p at
hand) located at $p_0 := .0085$. The probability $p_A^A(p_0)$ of a victory of A in an A-game
played under the side-out scoring system with $p = p_0$ is about $3.5 \times 10^{-31}$.
Estimating $v_A^A(p_0)$ with the same accuracy as that achieved for, e.g., $v_A^A(.5)$ in
Figure 5(g) would then require a number of replications of (fixed $p_0$) A-games
that is about 200 x p^(.5)/p^(po) ~ 3 x 10'^^. Assuming that 10^ replications 
can be performed in a second by a super computer (which is overly optimistic), 
this estimation of w^(po) would still require not less than 9.5 x 10"'^^ years! This 
means that it is indeed impossible to estimate in a reliable way the conditional
variance curve for p close to $p_0$, hence that there is no way to empirically capture
the convergence of $v_A^A(p)$ to 0 as $p \to 0$. Without our theoretical analysis, there
is therefore no hope to learn about the degeneracy (resp., non-degeneracy) of the
limiting distribution of D conditional on a victory of A as $p \to 0$ (resp., conditional
on a victory of B as $p \to 1$).
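
The orders of magnitude involved in this argument can be recomputed exactly rather than simulated. The following sketch (hypothetical function name, no-server model, side-out scoring with n = 15) works at the level of points: from a given server, the probability that each player scores the next point is obtained by summing out the serve exchanges, and the game-winning probability then follows by recursion over the scores.

```python
from functools import lru_cache

def side_out_win_prob(n, p):
    """P[A wins an A-game] under side-out scoring in the no-server model."""
    q = 1.0 - p
    cycle = 1.0 - p * q  # each exchange has probability p*q and leaves the state unchanged

    @lru_cache(maxsize=None)
    def win(a, b, a_serves):
        if a == n:
            return 1.0
        if b == n:
            return 0.0
        if a_serves:
            a_next, b_next = p / cycle, q * q / cycle  # B must first gain the serve
        else:
            a_next, b_next = p * p / cycle, q / cycle  # A must first gain the serve
        return a_next * win(a + 1, b, True) + b_next * win(a, b + 1, False)

    return win(0, 0, True)

if __name__ == "__main__":
    p0 = 0.0085
    rare = side_out_win_prob(15, p0)
    # replications needed to observe as many victories of A as with J = 200 at p = .5
    required = 200 * side_out_win_prob(15, 0.5) / rare
    print(f"P[A wins | p = p0] = {rare:.3g}, required replications = {required:.3g}")
```

The ratio printed at the end is the number of replications that would yield, on average, as many victories of A at p0 as 200 replications yield at p = .5.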

We will not comment in detail on the Monte Carlo results associated with the other
figures. We just report that they again confirm our theoretical findings, whenever
possible, that is, whenever they are not associated with results conditional on
events with small probabilities.



6 Conclusion and final comments. 

This paper provides a complete rally-level probabilistic description for games played
under the side-out scoring system. It complements the previous main contributions
by Phillips (1978), Strauss and Arnold (1987), and Simmons (1989) by adding to
the well-known game-winning probabilities an exhaustive knowledge of the random
duration of the game. This brings a much better understanding of the underlying
process as a whole, as is demonstrated in Sections 2 to 4.

In this final section, we will mainly focus on the practical implications of our
findings. For this, we may restrict to $(p_a,p_b) \in [.4, .6] \times [.4, .6]$, say, since players tend
to be grouped according to strength. For such values of the rally-winning probabilities,
our results show that the recent transition (in men's singles badminton) from
the n = 15 side-out scoring system to the n = 21 rally-point one strongly affected
the properties of the game. They indeed indicate that (i) games played under the
rally-point scoring system are much shorter than those played according to the
side-out one, and that (ii) the uncertainty in the duration of the match is significantly
reduced. Our results allow one to quantify both effects. On the other hand, they show
that game-winning probabilities are essentially the same for both scoring systems.






It is then tempting to conclude (as in Simmons (1989) and Percy (2009)) that 
the outcomes of the matches are barely influenced by the scoring system adopted. 
While this is strictly valid in the model, it is highly disputable under possible vio- 
lations of the model (stating, e.g., that players might get tired at different speeds) 
which, given the reduced duration of the game emphasized by our analysis, may 
appear quite relevant. 

In practice, the results of this paper can be useful to many actors of the sport
community. For the international sport federations playing with the idea of replacing
the side-out scoring system with the rally-point one, our results could be used to
tune n (i.e., the number of points to be scored to win a rally-point game) according
to their wishes. For the sake of illustration, consider again the transition performed
by the International Badminton Federation (IBF). Presumably, their objective was
(i) to make the duration of the game more predictable and (ii) to ensure that the
outcome of the matches would change as little as possible. If this was indeed their
objective, then our results show that it has been partially achieved. However, it is
now easy to see that other choices of n would have been even better in that respect,
the choice of n = 27 (see Figure 6(d) and (b)) being optimal. Moreover, this last
choice would have affected the duration of the game much less than n = 21 (see
Figure 6(c)), and thus would have made the outcome of the matches more robust
to possible violations of the model.

For organizers of local tournaments played under the side-out scoring system, 
our results can be used to control, for any fixed number of planned matches, the 
time required to complete their events. Such a control over this random time, at any 
fixed tolerance level, can indeed be achieved in a quite direct way from our results 
on the duration of a game played under the side-out scoring system. Organizers can 
then deduce, at the corresponding tolerance level, the number of matches — hence 
the number of players — their events will be able to host. This of course concerns the 
sports that are still using this scoring system, such as racquetball and squash (for 
the latter, only in countries currently using the so-called English scoring system). 

Finally, our results also open the way to more efficient estimation of the rally-winning
probabilities $(p_a,p_b)$ in the side-out scoring system. Consequently, they potentially
lead to more accurate ranking procedures (based on Bradley-Terry paired
comparison methods), which is of course of high interest to national and international
federations still supporting that scoring system. A full discussion of this
is beyond the scope of this paper (and is actually the topic of Paindaveine and
Swan 2009), and we only briefly describe the main idea here. Essentially, the results
of the present work, in a point estimation framework, enable us to perform
maximum likelihood estimation of $(p_a,p_b)$ based on game scores and durations. It is
natural to wonder how much the resulting estimators would improve on the purely
score-based maximum likelihood estimators proposed by Strauss and Arnold (1987).
It turns out that the improvement is very important. First of all, unlike the purely
score-based estimators, which require numerical optimization techniques, the
score-and-duration-based ones happen to allow for elegant closed-form expressions.
Second, they can be shown to enjoy strong finite-sample optimality properties. Last
but not least, they are much more accurate than their Strauss and Arnold (1987)
competitors, especially so for small numbers of observations (i.e., games). This
is illustrated in Figure 7, where it can be seen that even for as few as m = 2
games, the score-and-duration-based maximum likelihood estimators outperform
their competitors both in terms of bias and variability. Clearly, this will have priceless
practical consequences for the resulting ranking procedures, since any fixed pair
of players typically does not meet more than once or twice in the period (usually one
year) on which this ranking is to be computed.



A Appendix. 

A.1 Proof of Lemma 2.1 and Theorem 2.1.

In the Appendix, we simply write interruptions for A-interruptions. 

Proof of Lemma 2.1. Clearly, $p_A^{\alpha,\beta,A}(r,j) = K_{r,j}\, p_a^{\alpha} p_b^{\beta} (q_a q_b)^{r+j}$, where $K_{r,j}$ is
the number of ways of setting r interruptions and j exchanges in the sequence of
rallies achieving the event under consideration. Regarding interruptions, we argue
as in Hsi and Burich (1971), and say those r interruptions should be put into the
$\alpha$ possible spots (remember the last point should be won by A), while the $\beta$ points
scored by B should be distributed among those r interruptions, with at least one
point scored by B in each interruption (so that there may be at most $r = \min(\alpha, \beta)$
interruptions). There are exactly $\binom{\alpha}{r}\binom{\beta-1}{r-1}$ ways to achieve this. As for the j
exchanges, they may occur at any time and thus there are as many ways of placing
j exchanges as there are distributions of j indistinguishable balls into $\alpha + \beta$ urns,
i.e. $\binom{\alpha+\beta+j-1}{j}$. Summing up, we have proved that

$$p_A^{\alpha,\beta,A}(r,j) = \binom{\alpha}{r}\binom{\beta-1}{r-1}\binom{\alpha+\beta+j-1}{j}\, p_a^{\alpha} p_b^{\beta} (q_a q_b)^{r+j},$$

with $r = \min(\beta, 1), \ldots, \min(\alpha, \beta)$, $j \in \mathbb{N}$.

As for $p_A^{\alpha,\beta,B}(r,j)$, this probability is clearly of the form
$L_{r,j}\, p_a^{\alpha} p_b^{\beta}\, q_a (q_a q_b)^{r+j-1}$.
In this case, there are $\alpha + 1$ possible spots for the r interruptions. But since B
scores the last point, the sequence of rallies should end with an interruption. There
are therefore $\binom{\alpha}{r-1}$ ways to insert the interruptions. Each interruption contains
at least one point for B, so that $r \le \min(\alpha + 1, \beta)$. The result follows by noting
that there are $\binom{\beta-1}{r-1}$ ways of distributing the $\beta$ points scored by B into those r
interruptions, and by dealing with exchanges as for $p_A^{\alpha,\beta,A}(r,j)$. □



Proof of Theorem 2.1. The result directly follows from Lemma 2.1 by writing
$p_A^{\alpha,\beta,A} = \sum_{r,j} p_A^{\alpha,\beta,A}(r,j)$ and $p_A^{\alpha,\beta,B} = \sum_{r,j} p_A^{\alpha,\beta,B}(r,j)$ (where the sums
are over all possible values of r and j in each case), and by using the equality
$\sum_{j=0}^{\infty} \binom{\alpha+\beta+j-1}{j} z^j = (1 - z)^{-(\alpha+\beta)}$ for any $z \in [0, 1)$. □

A.2 Proof of Theorems 3.1 and 3.2 and of Corollary 3.1.



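Proof of Theorem 3.1. First note that if A scores the last point in an A-game
in which the score is $\alpha$ to $\beta$ after j exchanges ($j \in \{0, 1, \ldots\}$) and r interruptions
($r \in \{\gamma_0, \ldots, \gamma_1\}$), then there have been $\alpha + \beta + 2(r + j)$ rallies. Conditioning on
the number of interruptions and exchanges therefore yields

$$M_A^{\alpha,\beta,A}(t) = \sum_{r,j} e^{(\alpha+\beta+2(r+j))t}\, \frac{p_A^{\alpha,\beta,A}(r,j)}{p_A^{\alpha,\beta,A}}$$

(where the sums are over all possible values of r and j in each case) and thus, from
Lemma 2.1 and Theorem 2.1, writing $q := q_a q_b$,

$$M_A^{\alpha,\beta,A}(t) = \left(\frac{(1-q)\,e^{t}}{1-q e^{2t}}\right)^{\alpha+\beta} \frac{\sum_{r} \binom{\alpha}{r}\binom{\beta-1}{r-1} (q e^{2t})^{r}}{\sum_{r} \binom{\alpha}{r}\binom{\beta-1}{r-1} q^{r}}.$$

The first claim of Theorem 3.1 follows.

For the second claim, it suffices to note that if B scores the last point in an
A-game in which the score is $\alpha$ to $\beta$ after j exchanges ($j \in \{0, 1, \ldots\}$) and r
interruptions ($r \in \{1, \ldots, \gamma_2 + 1\}$), then the number of rallies equals
$\alpha + \beta + 2(r - 1 + j) + 1$; the computations above then hold with only minor changes. □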



Proof of Corollary 3.1. Taking first and second derivatives of the moment
generating functions yields the expectations and variances given in Corollary 3.1.
Moreover, it can easily be seen, by using the Cauchy-Schwarz inequality, that the
derivatives of the expected values with respect to q are positive, and thus the
latter are strictly monotone increasing in q. □



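Proof of Theorem 3.2. The change of variables $z = e^t$ in the moment generating
functions given in Theorem 3.1 immediately yields the probability generating
functions. If $\beta = 0$, the latter is already in the form of an infinite series:
$G_A^{\alpha,0,A}(z) = \sum_{j=0}^{\infty} (1-q)^{\alpha} \binom{\alpha+j-1}{j} q^{j} z^{\alpha+2j}$. If $\beta > 0$, we have

$$G_A^{\alpha,\beta,A}(z) = (1-q)^{\alpha+\beta} z^{\alpha+\beta} \sum_{j=0}^{\infty} \sum_{r=1}^{\gamma_1} K_j W_r z^{2(r+j)},$$

where $K_j = \binom{\alpha+\beta+j-1}{j} q^{j}$ and $W_r = W^{\alpha,\beta,A}(q,r)$. This double sum satisfies

$$\sum_{j=0}^{\infty} K_j \sum_{r=1}^{\gamma_1} W_r z^{2(r+j)} = \sum_{j=1}^{\gamma_1} \Big( \sum_{l=0}^{j-1} K_l W_{j-l} \Big) z^{2j} + \sum_{j=\gamma_1+1}^{\infty} \Big( \sum_{l=j-\gamma_1}^{j-1} K_l W_{j-l} \Big) z^{2j}.$$

The same arguments are readily adapted to $G_A^{\alpha,\beta,B}(z)$, and Theorem 3.2 follows. □

A.3 The distribution of the number of rallies, in the no-server
model, for extreme values of the rally-winning
probabilities.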



As announced in Section 4.2, we determine here the limiting behavior of the number
of rallies D, in the no-server model, for $p \to 0$ and $p \to 1$, conditional on the winner
of the A-game considered. We start with the limit under almost sure events, that
is, limits as $p \to 1$ (resp., $p \to 0$) for the distribution of D conditional on a victory
of A (resp., of B).

Proposition A.1 Let, for the side-out scoring system, $t \mapsto M_A^C(t) = E[e^{tD} \mid E^C, S = A]$,
$C \in \{A, B\}$, be the moment generating function of D conditional on the
event $E^C \cap [S = A]$. Denote by $t \mapsto \bar M_A^C(t) = E[e^{tD} \mid \bar E^C, S = A]$, $C \in \{A, B\}$,
the corresponding moment generating function for the rally-point system. Then, (i)
as $p \to 1$, $M_A^A(t) \to e^{nt}$ and $\bar M_A^A(t) \to e^{nt}$; (ii) as $p \to 0$, $M_A^B(t) \to e^{(n+1)t}$ and
$\bar M_A^B(t) \to e^{nt}$.

Proof: (i) By conditioning, we get $M_A^A(t) = \sum_{k=0}^{n-1} M_A^{n,k,A}(t)\, p_A^{n,k,A}/p_A^A$. It is easy
to check that $\lim_{p \to 1} p_A^{n,k,A}/p_A^A = \delta_{k,0}$ and that $\lim_{p \to 1} M_A^{n,k,A}(t) = e^{(n+k)t}$. Hence
$\lim_{p \to 1} M_A^A(t) = e^{nt}$. Likewise, $\bar M_A^A(t) = \sum_{k=0}^{n-1} e^{(n+k)t}\, \bar p_A^{n,k,A}/\bar p_A^A$. Again, it is easy
to check that $\lim_{p \to 1} \bar p_A^{n,k,A}/\bar p_A^A = \delta_{k,0}$. Hence, we indeed have $\bar M_A^A(t) \to e^{nt}$. (ii)
The proof is similar, and thus left to the reader. □

Corollary A.1 (i) As $p \to 1$, $(e_A^A, v_A^A) \to (n, 0)$ and $(\bar e_A^A, \bar v_A^A) \to (n, 0)$, so that,
conditional on a victory of A in an A-game, $D \to n$ in probability, irrespective of
the scoring system; (ii) as $p \to 0$, $(e_A^B, v_A^B) \to (n + 1, 0)$ and $(\bar e_A^B, \bar v_A^B) \to (n, 0)$, so that,
conditional on a victory of B in an A-game, $D \to n + 1$ (resp., $n$) in probability for
the side-out (resp., rally-point) scoring system.



As shown by Proposition A.1 and Corollary A.1, the situation is here very clear.
In each of the four cases considered, only one trajectory is possible, namely that
for which all rallies in the game will be won by the winner of the game.

Next we derive the limiting conditional distribution of D under events which
occur with zero probability, that is, limits as $p \to 1$ (resp., $p \to 0$) for the
distribution of D conditional on a victory of B (resp., of A). Our conclusions are much
more surprising.

Proposition A.2 Let $m(t) := \sum_{k=0}^{n-1} e^{(n+k)t} \binom{n+k-1}{k} \big/ \sum_{k=0}^{n-1} \binom{n+k-1}{k}$. Then, (i)
as $p \to 0$, $M_A^A(t) \to e^{nt}$ and $\bar M_A^A(t) \to m(t)$; (ii) as $p \to 1$,
$M_A^B(t) \to (e^{(n+1)t} - e^{(2n+1)t})/(n(1 - e^{t}))$ and $\bar M_A^B(t) \to m(t)$. In particular,
as $p \to 1$, the limiting distribution of D conditional on the event $E^B \cap [S = A]$ is
uniform over the set $\{n+1, \ldots, 2n\}$.

Proof: We first prove the assertions for the rally-point scoring system. In this
case, $\bar M_A^A(t) = \sum_{k=0}^{n-1} e^{(n+k)t}\, \bar p_A^{n,k,A}/\bar p_A^A$. Now, from Remark 4.1, it is immediate that

$$\lim_{p \to 0} \bar p_A^{n,k,A}/\bar p_A^A = \binom{n+k-1}{k} \Big/ \sum_{l=0}^{n-1} \binom{n+l-1}{l},$$

which proves the claim for $\bar M_A^A(t)$ (hence, by symmetry, also for $\bar M_A^B(t)$).

Next consider the assertions for the side-out scoring system. First note that, as
before, $M_A^A(t) = \sum_{k=0}^{n-1} M_A^{n,k,A}(t)\, p_A^{n,k,A}/p_A^A$ and $M_A^B(t) = \sum_{k=0}^{n-1} M_A^{k,n,B}(t)\, p_A^{k,n,B}/p_A^B$.
Now fix $k \in \{0, \ldots, n-1\}$. Using Theorem 2.1, one readily shows that

$$\lim_{p \to 0} p_A^{n,k,A}/p_A^A = \delta_{k,0} \quad \text{and} \quad \lim_{p \to 1} p_A^{k,n,B}/p_A^B = 1/n.$$

Combining these results and the definitions of the moment generating functions, it
is then straightforward to show that

$$\lim_{p \to 0} M_A^{n,0,A}(t) = e^{nt} \quad \text{and} \quad \lim_{p \to 1} M_A^{k,n,B}(t) = e^{(n+k+1)t}.$$

The claim follows. □
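Corollary A.2 (i) As $p \to 0$, $(e_A^A, v_A^A) \to (n, 0)$ and
$(\bar e_A^A, \bar v_A^A) \to \big(\frac{2n^2}{n+1}, \frac{2n^2(n-1)}{(n+1)^2(n+2)}\big)$; (ii)
as $p \to 1$, $(e_A^B, v_A^B) \to \big(\frac{3n+1}{2}, \frac{n^2-1}{12}\big)$ and
$(\bar e_A^B, \bar v_A^B) \to \big(\frac{2n^2}{n+1}, \frac{2n^2(n-1)}{(n+1)^2(n+2)}\big)$.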

It is remarkable that we can again give a complete description of the "distribution
of the process" (by this, we mean that we can again list all trajectories of
rallies leading to the event considered, and give, for each such trajectory, its
probability). Consider first the side-out scoring system. For victories of A, the situation
is very clear: Corollary A.2 indeed yields that, conditional on a victory of A in an
A-game, $D \to n$ as $p \to 0$, which implies that the only possible trajectory of rallies
is the one for which all rallies in the game are won by A. Turn then to victories
of B. There, we obtained in the proof of Proposition A.2 that all scores (k, n) are
equally likely. It is actually easy to show that, conditional on $E^{k,n,B} \cap [S = A]$,
$D \to n + k + 1$ as $p \to 1$. This implies that there are exactly n equally likely
trajectories: A first scores k points, then loses his/her serve, before B scores n
(miraculous) points and wins the game (k = 0, . . . , n - 1).
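
This uniform limit lies far beyond the reach of simulation, but it can be observed numerically by propagating the exact distribution of the game state rally by rally. The sketch below is an illustration under stated assumptions (no-server model, hypothetical function name, truncation after `max_rallies` rallies, which is harmless here since long games carry negligible mass).

```python
from collections import defaultdict

def duration_given_b_wins(n, p, max_rallies=200):
    """Distribution of the number of rallies D given that B wins a side-out A-game."""
    q = 1.0 - p
    # state (points of A, points of B, A is serving) -> probability mass
    states = {(0, 0, True): 1.0}
    b_wins = defaultdict(float)  # d -> P[B wins in exactly d rallies]
    for d in range(1, max_rallies + 1):
        nxt = defaultdict(float)
        for (a, b, a_serves), mass in states.items():
            if a_serves:
                if a + 1 < n:
                    nxt[(a + 1, b, True)] += mass * p  # A scores (a victory of A is dropped)
                nxt[(a, b, False)] += mass * q          # side-out: B gains the serve
            else:
                if b + 1 == n:
                    b_wins[d] += mass * q               # B scores the winning point
                else:
                    nxt[(a, b + 1, False)] += mass * q  # B scores
                nxt[(a, b, True)] += mass * p           # side-out: A regains the serve
        states = nxt
    total = sum(b_wins.values())
    return {d: w / total for d, w in b_wins.items()}

if __name__ == "__main__":
    dist = duration_given_b_wins(15, 1.0 - 1e-8)
    for d in range(16, 31):
        print(d, round(dist[d], 6))  # each value should be close to 1/15
```

Each printed value corresponds to one of the n trajectories described above, with D = n + k + 1 rallies for k = 0, . . . , n - 1.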

Consider finally the rally-point system. In this case, it is sufficient to study the
distribution of the scores after victories of A (when $p \to 0$) since the number of
rallies is a function of the scores only, and since the conclusions will, by symmetry,
be identical for victories of B (when $p \to 1$). Clearly, for any fixed
$k \in \{0, 1, \ldots, n-1\}$, there are exactly $\binom{n+k-1}{k}$ trajectories leading to the
score (n, k), and those trajectories are equally likely. Each such trajectory will then
occur with probability $1/\sum_{k=0}^{n-1} \binom{n+k-1}{k}$ because, as we have seen in the proof of
Proposition A.2, the score (n, k) occurs with probability
$\binom{n+k-1}{k} \big/ \sum_{l=0}^{n-1} \binom{n+l-1}{l}$. These considerations
provide the whole distribution of the process: there are $\sum_{k=0}^{n-1} \binom{n+k-1}{k}$ equally likely
possible trajectories, namely the ones we have just considered. The exact limiting
distribution of D can of course trivially be computed from this.




Figure 1: All subfigures refer to an A-game played under the side-out scoring
system with n = 15. Left: for $(p_a,p_b)$ = (.7, .5), (.6, .5), (.5, .5), and (.4, .5),
(a) probabilities $p_A^{n,k,A}$ that Player A wins the game on the score (n, k)
(along with the probabilities $p_A^A$ that Player A wins the game), (c) expected
values $e_A^{n,k,A}$ and (e) standard deviations $(v_A^{n,k,A})^{1/2}$ of the numbers of
rallies D conditional on the corresponding events (along with the expected values
$e_A^A$ and standard deviations $(v_A^A)^{1/2}$ of D conditional on a victory of A).
Right: the corresponding values for victories of B on the score (k, n). As
for the expected values and standard deviations of D unconditional on the
score or the winner, we have $(e_A, v_A^{1/2})$ = (33.5, 8.6), (41.6, 9.5), (48.7, 10.1),
and (52.5, 11.5), for $(p_a,p_b)$ = (.7, .5), (.6, .5), (.5, .5), and (.4, .5), respectively.
Estimated probabilities, expectations, and standard deviations based
on 5,000 replications are also reported (thinner lines in plots and numbers
between parentheses in legend boxes). Dashed lines in (c) correspond to
Simmons' (1989) lower and upper bounds.




Figure 2: Both subfigures refer to an A-game played under the side-out
scoring system with n = 15 and $(p_a,p_b)$ = (.6, .5). Subfigure (a) (resp.,
Subfigure (b)) reports, in black and as a function of k, the $\alpha$-quantile of the
number of rallies needed to complete the game, conditional on a victory of A
on the score (n, k) (resp., conditional on a victory of B on the score (k, n)),
with $\alpha$ = .01, .05, .25, .50, .75, .95, and .99. Solid lines (resp., dotted lines)
correspond to standard (resp., interpolated) quantiles; see Section 3.2 for
details. The green curves are the same as in Figure 1, hence give the expected
values of D conditional on the same events.

Figure 3: All subfigures refer to an A-game played under the side-out scoring
system with n = 15. For $(p_a,p_b)$ = (.7, .5), (.6, .5), (.5, .5), and (.4, .5), they
report the probabilities that the number of rallies D needed to complete
the game takes value d, (a) conditional upon a victory of Player A, (b)
conditional upon a victory of Player B, and (c) unconditional. Empirical
frequencies based on 20,000 replications are also reported (thinner lines in
plots and numbers between parentheses in legend boxes).









Figure 4: Both subfigures refer to an A-game played under the rally-point
scoring system with n = 21. Subfigure (a): for $(p_a,p_b)$ = (.7, .5), (.6, .5),
(.5, .5), and (.4, .5), probabilities $\bar p_A^{n,k,A}$ that Player A wins the game on
the score (n, k), along with the probabilities $\bar p_A^A$ that Player A wins the
game. Subfigure (b): the corresponding values for victories of B on the
score (k, n). Estimated probabilities based on 5,000 replications are also
reported (thinner lines in plots and numbers between parentheses in legend
boxes).



Figure 5: As a function of $p = p_a = 1 - p_b$ (that is, in the no-server model),
probabilities $p_A^{n,k,A}$ (in blue) that Player A wins an n = 15 side-out A-game
on the score (15, k), along with the probabilities $\bar p_A^{n,k,A}$ (in red) that
Player A wins an n = 21 rally-point A-game on the score (21, k). Expectations
(first row) and standard deviations (second row) of the number of
rallies needed to complete the corresponding games, unconditional on the
winner (first column), conditional on a victory of Player A (second column),
and conditional on a victory of Player B (third column). Estimated probabilities,
expectations, and standard deviations (based on 200 replications at
each value of p = 0, .0005, .0010, .0015, . . . , .9995) are also reported (thinner
lines).





Figure 6: Subfigures (a)-(e) here report Subfigures (a)-(c), (f), and (e) from
Figure 5, with the only difference that the rally-point scoring here is based
on n = 27 (the side-out scoring is still based on n = 15).

Figure 7: Subfigure (a) (resp., (b)) is a scatter plot of the values of
score-based (resp., score-and-duration-based) maximum likelihood estimators
for $(p_a,p_b)$, from J = 1,000 replications of m = 2 side-out A-games
with n = 15.